Automatic and lightweight grammar generation for fuzz testing

نویسندگان

  • Su Yong Kim
  • Sung Deok Cha
  • Doo-Hwan Bae
چکیده

Blackbox fuzz testing can only test a small portion of code when rigorously checking the well-formedness of input values. To overcome this problem, blackbox fuzz testing is performed using a grammar that delineates the format information of input values. However, it is almost impossible to manually construct a grammar if the input specifications are not known. We propose an alternative technique: the automatic generation of fuzzing grammars using API-level concolic testing. API-level concolic testing collects constraints at the library function level rather than the instruction level. While API-level concolic testing may be less accurate than instruction-level concolic testing, it is highly useful for speedily generating fuzzing grammars that enhance code coverage for real-world programs. To verify the feasibility of the proposed concept, we implemented the system for generating ActiveX control fuzzing grammars, named YMIR. The experiment results showed that the YMIR system was capable of generating fuzzing grammars that can raise branch coverage for ActiveX control using highly-structured input string by 15e50%. In addition, the YMIR system discovered two new vulnerabilities revealed only when input values are well-formed. Automatic fuzzing grammar generation through API-level concolic testing is not restricted to the testing of ActiveX controls; it should also be applicable to other string processing program whose source code is unavailable. a 2013 Elsevier Ltd. All rights reserved.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Optimizing Cost Function in Imperialist Competitive Algorithm for Path Coverage Problem in Software Testing

Search-based optimization methods have been used for software engineering activities such as software testing. In the field of software testing, search-based test data generation refers to application of meta-heuristic optimization methods to generate test data that cover the code space of a program. Automatic test data generation that can cover all the paths of software is known as a major cha...

متن کامل

Efficient Model-based Fuzz Testing Using Higher-order Attribute Grammars

Format specifications of data input are critical to model-based fuzz testing. Present methods cannot describe the format accurately, which leads to high redundancy in testing practices. In order to improve testing efficiency, we propose a grammar-driven approach to fuzz testing. Firstly, we build a formal model of data format using higher-order attribute grammars, and construct syntax tree on t...

متن کامل

Learn&Fuzz: machine learning for input fuzzing

Fuzzing consists of repeatedly testing an application with modified, or fuzzed, inputs with the goal of finding security vulnerabilities in input-parsing code. In this paper, we show how to automate the generation of an input grammar suitable for input fuzzing using sample inputs and neural-network-based statistical machine-learning techniques. We present a detailed case study with a complex in...

متن کامل

Software Verification: Testing vs. Model Checking

In practice, software testing has been the established method for finding bugs in programs for a long time. But in the last 15 years, software model checking has received a lot of attention, and many successful tools for software model checking exist today. We believe it is time for a careful comparative evaluation of automatic software testing against automatic software model checking. We chos...

متن کامل

Semi-Automatic Generation of a Software Testing Lightweight Ontology from a Glossary Based on the ONTO6 Methodology

We propose a methodology of semi-automatic obtaining of a lightweight ontology for software testing domain based on the glossary “Standard glossary of terms used in Software Testing” created by ISTQB. From the same glossary many ontologies might be developed depending on the strategy for extracting concepts, categorizing them, and determining hierarchical and some other relationships. Initially...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Computers & Security

دوره 36  شماره 

صفحات  -

تاریخ انتشار 2013